Calculating a sum of numbers using an overflow counter in an environment exceeded by the numbers in bit-size

ABSTRACT

Logic for calculating a sum of numbers using an overflow counter in an environment exceeded by the numbers in bit-size accesses a least significant portion of a first number, accesses a least significant portion of a second number, and adds the least significant portion of the first number to the least significant portion of the second number. The resulting sum includes a first intermediate number. If a carry is generated by the addition of the least significant portion of the first number to the least significant portion of the second number, the logic increments an overflow counter to record the generated carry. The logic accesses least significant portions of the remaining numbers, adds the least significant portions to the first intermediate number, and increments the overflow counter each time a carry is generated to record the generated carry. After each of the least significant portions has been added to the first intermediate number, the logic stores the first intermediate number. The first intermediate number includes a least significant portion of the sum of the multiple numbers. The logic accesses a most significant portion of the first number and adds the overflow counter to the most significant portion of the first number. The resulting sum includes a second intermediate number. The logic accesses most significant portions of the remaining numbers and adds each of the most significant portions to the second intermediate number. After each of the most significant portions has been added to the second intermediate number, the logic stores the second intermediate number. The second intermediate number includes a most significant portion of the sum of the multiple numbers.

TECHNICAL FIELD OF THE INVENTION

[0001] This invention relates generally to processor operations and more particularly to calculating a sum of numbers in an environment exceeded by the numbers in bit-size using an overflow counter.

BACKGROUND OF THE INVENTION

[0002] In a 32-bit environment, sums of 64-bit numbers are typically calculated according to the following algorithm. The least significant thirty-two bits of the first 64-bit number are added to the least significant thirty-two bits of the second 64-bit number. The resulting sum includes a carry (which includes either a one or a zero) and the least significant thirty-two bits of a first 64-bit intermediate sum. The carry and the most significant thirty-two bits of the second 64-bit number are then added to the most significant thirty-two bits of the first 64-bit number, and the resulting sum includes the most significant thirty-two bits of the first 64-bit intermediate sum. The least significant thirty-two bits of the first 64-bit intermediate sum are then added to the least significant thirty-two bits of the third 64-bit number, and the resulting sum includes a carry and the least significant thirty-two bits of a second 64-bit intermediate sum. The carry and the most significant thirty-two bits of the third 64-bit number are then added to the most significant thirty-two bits of the first 64-bit intermediate sum, and the resulting sum includes the most significant thirty-two bits of the second 64-bit intermediate sum. The least significant thirty-two bits of the second 64-bit intermediate sum are then added to the least significant thirty-two bits of the fourth 64-bit number, and the resulting sum includes a carry and the least significant thirty-two bits of a third 64-bit intermediate sum. The carry and the most significant thirty-two bits of the fourth 64-bit number are then added to the most significant thirty-two bits of the second 64-bit intermediate sum, and the resulting sum includes the most significant thirty-two bits of the third 64-bit intermediate sum. 
This continues until the least and most significant thirty-two bits of the final 64-bit number are added to the least and most significant thirty-two bits of the preceding 64-bit intermediate sum, respectively. The final resulting sums include the least and most significant thirty-two bits of the sum of the 64-bit numbers. A drawback of such an algorithm is that two 32-bit registers are typically required to store the 64-bit intermediate sums, which may adversely affect processor performance and operation efficiency.

SUMMARY OF THE INVENTION

[0003] Particular embodiments of the present invention may reduce or eliminate disadvantages and problems traditionally associated with calculating a sum of a plurality of numbers in an environment exceeded by the numbers in bit-size using an overflow counter.

[0004] In one embodiment of the present invention, logic for calculating a sum of numbers using an overflow counter in an environment exceeded by the numbers in bit-size accesses a least significant portion of a first number of multiple numbers, accesses a least significant portion of a second number of the multiple numbers, and adds the least significant portion of the first number to the least significant portion of the second number. The resulting sum includes a first intermediate number. If a carry is generated by the addition of the least significant portion of the first number to the least significant portion of the second number, the logic accesses an overflow counter and increments the overflow counter to record the generated carry. The logic accesses each of multiple least significant portions of the remaining multiple numbers, adds each of the multiple least significant portions to the first intermediate number, and accesses and increments the overflow counter each time a carry is generated to record the generated carry. After each of the multiple least significant portions has been added to the first intermediate number, the logic stores the first intermediate number. The first intermediate number includes a least significant portion of the sum of the multiple numbers. The logic accesses a most significant portion of the first number and adds the overflow counter to the most significant portion of the first number. The resulting sum includes a second intermediate number. The logic accesses each of multiple most significant portions of the remaining multiple numbers and adds each of the multiple most significant portions to the second intermediate number. After each of the multiple most significant portions has been added to the second intermediate number, the logic stores the second intermediate number. The second intermediate number includes a most significant portion of the sum of the multiple numbers.

[0005] Particular embodiments of the present invention may provide one or more technical advantages. In particular embodiments, a sum of numbers may be calculated in an environment exceeded by the numbers in bit-size using a single register and an overflow counter, which may improve processor performance and operation efficiency. Certain embodiments may provide one or more other technical advantages which may be readily apparent to those skilled in the art from the figures, descriptions, and claims included herein.

BRIEF DESCRIPTION OF THE DRAWINGS

[0006] To provide a more complete understanding of the present invention and the features and advantages thereof, reference is made to the following description, taken in conjunction with the accompanying drawings, in which:

[0007] FIG. 1 illustrates an example processor system; and

[0008] FIG. 2 illustrates an example method for calculating a sum of 64-bit numbers in a 32-bit environment using an overflow counter.

DESCRIPTION OF EXAMPLE EMBODIMENTS

[0009] FIG. 1 illustrates an example processor system 10, which may include a digital signal processor (DSP). Although a particular processor system 10 is described and illustrated, the present invention contemplates any suitable processor system 10 including any suitable architecture. Processor system 10 may include program memory 12, data memory 14, and processor 16. Program memory 12 may be used to store program instructions for operations executed by processor 16, and data memory 14 may be used to store data used in operations executed by processor 16. Data (which may include program instructions, data used in operations executed by processor 16, or any other suitable data) may be communicated between processor 16 and program memory 12 and between processor 16 and data memory 14 using data buses 18, which may include any suitable physical medium for such communication. For example, data buses 18 may include one or more wires coupling processor 16 to program memory 12 and data memory 14. The number of bits that may be communicated across a data bus 18 in one clock cycle (which may include a unit of time between two adjacent pulses of a clock signal for processor system 10) may be limited. For example, in a 32-bit environment, a maximum of thirty-two bits may be communicated across each data bus 18 in one clock cycle. Data addresses (which may specify locations for data within program memory 12, data memory 14, or elsewhere and may, where appropriate, include the locations themselves) may be communicated between processor 16 and program memory 12 and between processor 16 and data memory 14 using address buses 20, which may include any suitable physical medium for such communication. For example, address buses 20 may include one or more wires coupling processor 16 with program memory 12 and data memory 14. Similar to data buses 18, the number of bits that may be communicated across an address bus 20 in one clock cycle may be limited.

[0010] Processor 16 may execute mathematical, logical, and any other suitable operations and may, for example only and not by way of limitation, include one or more shifters 22, arithmetic-logic units (ALUs) 24, multipliers 26, data registers 28, instruction caches 30, program sequencers 32, and data address generators 34. Although a particular processor 16 is described and illustrated, the present invention contemplates any suitable processor 16 including any suitable components. Shifter 22 may be used to left- or right-shift data units and perform other suitable tasks. ALU 24 may be used for addition, subtraction, absolute value operations, logical operations (such as, for example, AND, OR, NAND, NOR, and NOT operations), and other suitable tasks. Multiplier 26 may be used for multiplication and other suitable tasks. In a 32-bit environment, shifter 22, ALU 24, and multiplier 26 may each process a maximum of thirty-two bits in one clock cycle. For example, ALU 24 may in one clock cycle add numbers that include at most thirty-two bits. To add numbers that include more than thirty-two bits, the numbers may be divided into parts that each include thirty-two or fewer bits and added in parts.
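The division into parts described above may be sketched as follows. This is only an illustrative sketch in Python, not part of the described embodiments; the names `MASK32`, `split64`, and `add64_in_parts` are hypothetical, and Python integers are masked to thirty-two bits to imitate a 32-bit ALU.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit word size of the environment


def split64(x):
    """Return the (low32, high32) parts of an unsigned 64-bit value."""
    return x & MASK32, (x >> 32) & MASK32


def add64_in_parts(a, b):
    """Add two 64-bit values using only 32-bit-wide partial additions."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    lo = a_lo + b_lo
    carry = lo >> 32              # 1 if the low addition overflowed, else 0
    lo &= MASK32                  # keep only what a 32-bit register holds
    hi = (a_hi + b_hi + carry) & MASK32
    return (hi << 32) | lo        # reassembled 64-bit result (wraps at 2**64)
```

Each partial addition here involves operands of at most thirty-two bits, with the carry from the low part passed into the high part.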

[0011] Registers 28 may include a number of memory locations for storing intermediate operation results, flags for program control, and the like. For example, registers 28 may include one or more general data registers, temporary registers, condition code registers (CCRs), status registers (SRs), address registers, and other suitable registers. In a 32-bit environment, each register 28 may be used to store a maximum of thirty-two bits. Instruction cache 30 may be used to store one or more program instructions for recurring operations. For example, program instructions for one or more operations that are part of a loop of operations executed by processor 16 may be stored using instruction cache 30 such that program memory 12 need not be accessed each time a program instruction for one or more of the operations is to be executed. Program sequencer 32 may direct the execution of operations by processor 16 and perform other suitable tasks. Data address generators 34 may communicate addresses to program memory 12 and data memory 14 specifying memory locations within program memory 12 and data memory 14 from which data may be read and to which data may be written. Although particular components of processor 16 are described as performing particular tasks, any suitable components of processor 16, alone or in combination, may perform any suitable tasks. In addition, although the components of processor 16 are described and illustrated as separate components, any suitable component of processor 16 may be wholly or partly incorporated into one or more other components of processor 16.

[0012] Sums of 64-bit numbers may be calculated by processor system 10. Equations for such calculations may include the following:

Y = X1 + X2 + X3 + . . . + Xn

[0013] Y may include a 64-bit number, and X1 through Xn may also include 64-bit numbers. Y and X1 through Xn may be stored in memory locations within data memory 14, elsewhere within processor system 10, or outside processor system 10.

[0014] In a 32-bit environment, sums of 64-bit numbers have traditionally been calculated according to the following algorithm, which may be called “summation by parts.” The least significant thirty-two bits of the first 64-bit number are added to the least significant thirty-two bits of the second 64-bit number. The resulting sum includes a carry (which includes either a one or a zero) and the least significant thirty-two bits of a first 64-bit intermediate sum. The carry and the most significant thirty-two bits of the second 64-bit number are then added to the most significant thirty-two bits of the first 64-bit number, and the resulting sum includes the most significant thirty-two bits of the first 64-bit intermediate sum. The least significant thirty-two bits of the first 64-bit intermediate sum are then added to the least significant thirty-two bits of the third 64-bit number, and the resulting sum includes a carry and the least significant thirty-two bits of a second 64-bit intermediate sum. The carry and the most significant thirty-two bits of the third 64-bit number are then added to the most significant thirty-two bits of the first 64-bit intermediate sum, and the resulting sum includes the most significant thirty-two bits of the second 64-bit intermediate sum. The least significant thirty-two bits of the second 64-bit intermediate sum are then added to the least significant thirty-two bits of the fourth 64-bit number, and the resulting sum includes a carry and the least significant thirty-two bits of a third 64-bit intermediate sum. The carry and the most significant thirty-two bits of the fourth 64-bit number are then added to the most significant thirty-two bits of the second 64-bit intermediate sum, and the resulting sum includes the most significant thirty-two bits of the third 64-bit intermediate sum. 
This continues until the least and most significant thirty-two bits of the final 64-bit number are added to the least and most significant thirty-two bits of the preceding 64-bit intermediate sum, respectively. The final resulting sums include the least and most significant thirty-two bits of the sum of the 64-bit numbers.

[0015] Such an algorithm may be described as follows:

    RegA(low32) = X1(low32)
    RegB(high32) = X1(high32)
    RegA(low32) = RegA(low32) + X2(low32), C = 1 if overflow, else C = 0
    RegB(high32) = RegB(high32) + X2(high32) + C
    RegA(low32) = RegA(low32) + X3(low32), C = 1 if overflow, else C = 0
    RegB(high32) = RegB(high32) + X3(high32) + C
    . . .
    RegA(low32) = RegA(low32) + Xn(low32), C = 1 if overflow, else C = 0
    RegB(high32) = RegB(high32) + Xn(high32) + C
    Y(low32) = RegA(low32)
    Y(high32) = RegB(high32)

[0016] RegA and RegB may include the least significant thirty-two bits and the most significant thirty-two bits, respectively, of the 64-bit intermediate results, and may be stored in registers 28. X1(low32) and X1(high32) may include the least significant thirty-two bits and the most significant thirty-two bits, respectively, of the first 64-bit number of the 64-bit numbers, X2(low32) and X2(high32) may include the least significant thirty-two bits and the most significant thirty-two bits, respectively, of the second 64-bit number of the 64-bit numbers, and so on, and may be stored in memory locations within data memory 14. Y(low32) and Y(high32) may include the least and most significant thirty-two bits, respectively, of the sum of the 64-bit numbers and may be stored in memory locations within data memory 14. A drawback of such an algorithm is that two registers 28, RegA and RegB, are required to store 64-bit intermediate results.
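For illustration only, the traditional two-register algorithm may be sketched in Python as follows. The function name `sum64_two_registers` and the variables `reg_a`, `reg_b`, and `c` are hypothetical stand-ins for RegA, RegB, and the carry C; masking with `MASK32` is an assumption used to imitate 32-bit registers.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width


def sum64_two_registers(numbers):
    """Summation by parts: reg_a holds the low word, reg_b the high word."""
    reg_a = numbers[0] & MASK32
    reg_b = (numbers[0] >> 32) & MASK32
    for x in numbers[1:]:
        total = reg_a + (x & MASK32)
        c = total >> 32                       # C = 1 if overflow, else C = 0
        reg_a = total & MASK32
        reg_b = (reg_b + ((x >> 32) & MASK32) + c) & MASK32
    return reg_a, reg_b                       # (Y(low32), Y(high32))
```

Both `reg_a` and `reg_b` stay live across every iteration, which corresponds to the two-register drawback noted above.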

[0017] In particular embodiments, sums of 64-bit numbers may be calculated using a single register 28 and an overflow counter, which may be stored in a status register 28 within processor 16 or any other suitable location within or outside processor system 10. In such embodiments, the least significant thirty-two bits of the sum of the 64-bit numbers may be calculated and generated carries may be recorded using the overflow counter (which may be incremented by one every time a carry is generated). The most significant thirty-two bits of the sum of the 64-bit numbers may then be calculated, taking into account carries from the calculation of the least significant thirty-two bits of the sum of the 64-bit numbers recorded using the overflow counter. The overflow counter may be stored in one or more status registers 28 or other suitable locations within or outside processor system 10. The overflow counter may include any suitable number of bits, which number may determine the number of sequential additions the overflow counter may accommodate. For example, a 6-bit overflow counter may accommodate 2⁶ (sixty-four) sequential additions. An algorithm for calculating sums of 64-bit numbers using an overflow counter may be described as follows:

    ; Calculate Low Thirty-Two Bits
    OVCU = 0
    RegA(low32) = X1(low32)
    RegA(low32) = RegA(low32) + X2(low32), increment OVCU if overflow
    RegA(low32) = RegA(low32) + X3(low32), increment OVCU if overflow
    . . .
    RegA(low32) = RegA(low32) + Xn(low32), increment OVCU if overflow
    Y(low32) = RegA(low32)
    ; Calculate High Thirty-Two Bits
    RegA = OVCU
    RegA(high32) = RegA(high32) + X1(high32)
    RegA(high32) = RegA(high32) + X2(high32)
    RegA(high32) = RegA(high32) + X3(high32)
    . . .
    RegA(high32) = RegA(high32) + Xn(high32)
    Y(high32) = RegA(high32)
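The relationship between counter width and the number of values summed may be checked with a small worked example. The sketch below rests on one reading of the capacity statement, offered as an assumption: each 32-bit addition generates at most one carry, so summing sixty-four values involves sixty-three additions and at most sixty-three carries, the largest value a 6-bit counter can hold. The name `count_low_carries` is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width


def count_low_carries(low_words):
    """Add 32-bit low words in a single register and count the carries."""
    ovcu = 0
    reg_a = low_words[0]
    for w in low_words[1:]:
        total = reg_a + w
        ovcu += total >> 32       # each addition contributes at most one carry
        reg_a = total & MASK32
    return reg_a, ovcu


# Worst case for sixty-four values: every low word is all ones.
reg_a, ovcu = count_low_carries([MASK32] * 64)
# ovcu is 63 and reg_a is 0xFFFFFFC0, so the count still fits in six bits.
```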

[0018] RegA may be stored in a register 28 or other suitable location within or outside processor system 10. X1(low32) and X1(high32) may include the least significant thirty-two bits and the most significant thirty-two bits, respectively, of the first 64-bit number of the 64-bit numbers, X2(low32) and X2(high32) may include the least significant thirty-two bits and the most significant thirty-two bits, respectively, of the second 64-bit number of the 64-bit numbers, and so on. These bits may be stored in memory locations within data memory 14, elsewhere within processor system 10, or outside processor system 10. Y(low32) and Y(high32) may include the least and most significant thirty-two bits, respectively, of the sum of the 64-bit numbers and may be stored in memory locations within data memory 14, elsewhere within processor system 10, or outside processor system 10. Such an algorithm may require only one register for storing intermediate sums and may enable the use of repeat operations, which may reduce code size and improve processor performance. Such an algorithm, in particular embodiments, may also be described as follows:

    ; Calculate Low Thirty-Two Bits
    RegA(low32) = *Source(low32);
    RPT #N || ADDUL RegA,*Source(low32)++;    // Increment OVCU if overflow
    Y(low32) = RegA;
    ; Calculate High Thirty-Two Bits
    RegA(high32) = OVCU;
    RPT #N || ADDL RegA,*Source(high32)++;
    Y(high32) = RegA;
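A minimal Python sketch of the single-register approach described above follows. It is an illustration under stated assumptions rather than the described implementation: `sum64_overflow_counter` is a hypothetical name, `reg_a` and `ovcu` stand in for RegA and OVCU, and 32-bit behavior is imitated by masking.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width


def sum64_overflow_counter(numbers):
    """Sum 64-bit values using one 32-bit register and an overflow counter."""
    ovcu = 0

    # Pass 1: low thirty-two bits, recording every carry in the counter.
    reg_a = numbers[0] & MASK32
    for x in numbers[1:]:
        total = reg_a + (x & MASK32)
        ovcu += total >> 32                   # increment OVCU if overflow
        reg_a = total & MASK32
    y_low = reg_a

    # Pass 2: high thirty-two bits, reusing the same register and
    # starting from the recorded carries.
    reg_a = ovcu & MASK32
    for x in numbers:
        reg_a = (reg_a + ((x >> 32) & MASK32)) & MASK32
    y_high = reg_a

    return y_low, y_high
```

Because each pass is a run of identical additions over consecutive operands, each loop corresponds to a single repeated instruction in the listing above.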

[0019] Although sums of 64-bit numbers calculated in a 32-bit environment have been described, the present invention contemplates sums of numbers of any suitable bit-length calculated in any suitable environment where the size of the numbers exceeds the size of one or more ALUs 24 or other components of a processor system 10. For example, the algorithm described above for calculating sums of 64-bit numbers in a 32-bit environment may be used to calculate sums of 128-bit numbers in a 64-bit environment. Although sums of numbers have been described, the present invention contemplates any suitable operations (which may include additions, subtractions, or both). For example, numbers may be subtracted according to the algorithm described above and generated borrows may be recorded using the overflow counter (which may be decremented by one every time a borrow is generated).
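The subtraction variant may be sketched as follows, again only as an illustration. The sketch assumes a signed overflow counter that is incremented on a carry and decremented on a borrow; the function name `add_sub64` and the `(sign, value)` term encoding are hypothetical conveniences, not taken from the description.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width


def add_sub64(first, terms):
    """Combine 64-bit values; terms is a list of (sign, value), sign +1 or -1."""
    ovcu = 0
    reg_a = first & MASK32
    for sign, x in terms:
        total = reg_a + sign * (x & MASK32)
        if total > MASK32:
            ovcu += 1                 # carry generated by an addition
        elif total < 0:
            ovcu -= 1                 # borrow generated by a subtraction
        reg_a = total & MASK32        # wraps like a 32-bit register
    y_low = reg_a

    # High pass: start from the net carries, which may be negative.
    reg_a = (ovcu + ((first >> 32) & MASK32)) & MASK32
    for sign, x in terms:
        reg_a = (reg_a + sign * ((x >> 32) & MASK32)) & MASK32
    return y_low, reg_a
```

For example, `add_sub64(0x200000000, [(-1, 1)])` records one borrow in the low pass and yields the words of `0x1FFFFFFFF`.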

[0020] FIG. 2 illustrates an example method for calculating a sum of 64-bit numbers in a 32-bit environment using an overflow counter. The method begins at step 100, where the least significant thirty-two bits of the first 64-bit number of the 64-bit numbers are accessed. As described above, the 64-bit numbers may be stored in memory locations within data memory 14 or any other suitable location. At step 102, the least significant thirty-two bits of the second 64-bit number are accessed. At step 104, the least significant thirty-two bits of the first 64-bit number are added to the least significant thirty-two bits of the second 64-bit number, resulting in an intermediate 32-bit sum for the least significant thirty-two bits of the sum of 64-bit numbers. As described above, the intermediate 32-bit sum may be stored in a register 28 or any other suitable location. At step 106, if a carry was generated by the addition of the least significant thirty-two bits of the first 64-bit number to the least significant thirty-two bits of the second 64-bit number, the method proceeds to step 108. At step 108, an overflow counter is incremented to record the generated carry. As described above, the overflow counter may be stored in a status register 28 or any other suitable location. At step 106, if a carry was not generated by the addition of the least significant thirty-two bits of the first 64-bit number to the least significant thirty-two bits of the second 64-bit number, the method proceeds to step 110.

[0021] At step 110, the least significant thirty-two bits of the next 64-bit number are accessed. At step 112, the least significant thirty-two bits of the 64-bit number accessed at step 110 are added to the intermediate 32-bit sum for the least significant thirty-two bits of the sum of 64-bit numbers. At step 114, if a carry was generated by the addition of the least significant thirty-two bits of the 64-bit number accessed at step 110 to the intermediate 32-bit sum for the least significant thirty-two bits of the sum of 64-bit numbers, the method proceeds to step 116. At step 116, the overflow counter is incremented to record the generated carry. At step 114, if a carry was not generated by the addition of the least significant thirty-two bits of the 64-bit number accessed at step 110 to the intermediate 32-bit sum for the least significant thirty-two bits of the sum of 64-bit numbers, the method proceeds to step 118. At step 118, if the least significant thirty-two bits of the final 64-bit number have not been added to the intermediate 32-bit sum for the least significant thirty-two bits of the sum of 64-bit numbers, the method returns to step 110. At step 118, if the least significant thirty-two bits of the final 64-bit number have been added to the intermediate 32-bit sum for the least significant thirty-two bits of the sum of 64-bit numbers, the method proceeds to step 120. At step 120, the intermediate 32-bit sum for the least significant thirty-two bits of the sum of 64-bit numbers (which, after the addition of the least significant thirty-two bits of the final 64-bit number, includes the least significant thirty-two bits of the sum of the 64-bit numbers) is stored.

[0022] At step 122, the overflow counter is accessed. At step 124, the most significant thirty-two bits of the first 64-bit number are accessed. At step 126, the most significant thirty-two bits of the first 64-bit number are added to the overflow counter, resulting in an intermediate 32-bit sum for the most significant thirty-two bits of the sum of the 64-bit numbers. As described above, the intermediate 32-bit sum for the most significant thirty-two bits of the sum of the 64-bit numbers may be stored in the same register in which the intermediate 32-bit sum for the least significant thirty-two bits of the sum of the 64-bit numbers was stored. At step 128, the most significant thirty-two bits of the next 64-bit number are accessed. At step 130, the most significant thirty-two bits of the 64-bit number accessed at step 128 are added to the intermediate 32-bit sum for the most significant thirty-two bits of the sum of the 64-bit numbers. At step 132, if the most significant thirty-two bits of the final 64-bit number have not been added to the intermediate 32-bit sum for the most significant thirty-two bits of the sum of the 64-bit numbers, the method returns to step 128. At step 132, if the most significant thirty-two bits of the final 64-bit number have been added to the intermediate 32-bit sum for the most significant thirty-two bits of the sum of the 64-bit numbers, the method proceeds to step 134. At step 134, the intermediate sum for the most significant thirty-two bits of the sum of the 64-bit numbers (which, after the addition of the most significant thirty-two bits of the final 64-bit number, includes the most significant thirty-two bits of the sum of the 64-bit numbers) is stored, at which point the method ends.
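As a reading aid, the steps of the method of FIG. 2 may be mapped onto a short Python sketch. The mapping of step numbers to lines is an interpretation of the text, not a reproduction of the figure. The name `fig2_sum` is hypothetical, and the input is assumed to be a list of at least two `(low32, high32)` word pairs, reflecting storage of each 64-bit number in two memory locations.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width


def fig2_sum(memory):
    """memory: list of (low32, high32) word pairs, at least two entries."""
    ovcu = 0
    reg = memory[0][0]                    # step 100
    for low, _ in memory[1:]:             # steps 102 and 110
        total = reg + low                 # steps 104 and 112
        if total > MASK32:                # steps 106 and 114
            ovcu += 1                     # steps 108 and 116
        reg = total & MASK32
    y_low = reg                           # step 120 (loop exit is step 118)

    reg = ovcu                            # step 122
    reg = (reg + memory[0][1]) & MASK32   # steps 124 and 126
    for _, high in memory[1:]:            # step 128
        reg = (reg + high) & MASK32       # step 130
    y_high = reg                          # step 134 (loop exit is step 132)
    return y_low, y_high
```

The single variable `reg` is reused for both passes, matching the statement above that the intermediate sums may share the same register.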

[0023] Although a method for calculating the sum of 64-bit numbers in a 32-bit environment has been described, the present invention, as described above, contemplates sums of numbers of any suitable bit-length calculated in any suitable environment where the size of the numbers exceeds the size of one or more ALUs 24 or other components of a processor system 10. Additionally, although a method for calculating the sum of numbers has been described, the present invention, as described above, contemplates any suitable operations (which may include additions, subtractions, or both).

[0024] Although the present invention has been described with several embodiments, sundry changes, substitutions, variations, alterations, and modifications may be suggested to one skilled in the art, and it is intended that the invention may encompass all such changes, substitutions, variations, alterations, and modifications falling within the spirit and scope of the appended claims. 

What is claimed is:
 1. Logic for calculating a sum of numbers using an overflow counter in an environment exceeded by the numbers in bit-size, the logic encoded in media and when executed operable to: access a least significant portion of a first number of a plurality of numbers; access a least significant portion of a second number of the plurality of numbers; add the least significant portion of the first number to the least significant portion of the second number, the resulting sum comprising a first intermediate number; if a carry is generated by the addition of the least significant portion of the first number to the least significant portion of the second number, increment an overflow counter to record the generated carry; access each of a plurality of least significant portions of the remaining plurality of numbers, add each of the plurality of least significant portions to the first intermediate number, and increment the overflow counter each time a carry is generated to record the generated carry; after each of the plurality of least significant portions has been added to the first intermediate number, store the first intermediate number, the first intermediate number comprising a least significant portion of the sum of the plurality of numbers; access a most significant portion of the first number; add the overflow counter to the most significant portion of the first number, the resulting sum comprising a second intermediate number; access each of a plurality of most significant portions of the remaining plurality of numbers and add each of the plurality of most significant portions to the second intermediate number; and after each of the plurality of most significant portions has been added to the second intermediate number, store the second intermediate number, the second intermediate number comprising a most significant portion of the sum of the plurality of numbers.
 2. The logic of claim 1, wherein: each of the plurality of numbers is stored in one or more memory locations within data memory; the overflow counter is stored in a status register; the first intermediate number is stored in a single register; and the second intermediate number is stored in the same register as the first intermediate number.
 3. The logic of claim 1, wherein: the bit-size of the plurality of numbers comprises sixty-four bits; and the bit-size of the environment comprises thirty-two bits.
 4. The logic of claim 1, encoded in a digital signal processor (DSP).
 5. The logic of claim 1, operable to: subtract a least significant portion of a number of the plurality of numbers from a least significant portion of another number of the plurality of numbers; and if a borrow is generated by the subtraction of the least significant portion of the number from the least significant portion of the other number, decrement the overflow counter to record the generated borrow.
 6. A method for calculating a sum of numbers using an overflow counter in an environment exceeded by the numbers in bit-size, the method comprising: accessing a least significant portion of a first number of a plurality of numbers; accessing a least significant portion of a second number of the plurality of numbers; adding the least significant portion of the first number to the least significant portion of the second number, the resulting sum comprising a first intermediate number; if a carry is generated by the addition of the least significant portion of the first number to the least significant portion of the second number, incrementing an overflow counter to record the generated carry; accessing each of a plurality of least significant portions of the remaining plurality of numbers, adding each of the plurality of least significant portions to the first intermediate number, and incrementing the overflow counter each time a carry is generated to record the generated carry; after each of the plurality of least significant portions has been added to the first intermediate number, storing the first intermediate number, the first intermediate number comprising a least significant portion of the sum of the plurality of numbers; accessing a most significant portion of the first number; adding the overflow counter to the most significant portion of the first number, the resulting sum comprising a second intermediate number; accessing each of a plurality of most significant portions of the remaining plurality of numbers and adding each of the plurality of most significant portions to the second intermediate number; and after each of the plurality of most significant portions has been added to the second intermediate number, storing the second intermediate number, the second intermediate number comprising a most significant portion of the sum of the plurality of numbers.
 7. The method of claim 6, wherein: each of the plurality of numbers is stored in one or more memory locations within data memory; the overflow counter is stored in a status register; the first intermediate number is stored in a single register; and the second intermediate number is stored in the same register as the first intermediate number.
 8. The method of claim 6, wherein: the bit-size of the plurality of numbers comprises sixty-four bits; and the bit-size of the environment comprises thirty-two bits.
 9. The method of claim 6, executed by a digital signal processor (DSP).
 10. The method of claim 6, comprising: subtracting a least significant portion of a number of the plurality of numbers from a least significant portion of another number of the plurality of numbers; and if a borrow is generated by the subtraction of the least significant portion of the number from the least significant portion of the other number, decrementing the overflow counter to record the generated borrow.
 11. Logic for calculating a sum of 64-bit numbers using an overflow counter in a 32-bit environment, the logic encoded in a digital signal processor (DSP) and when executed operable to: access a least significant 32-bit portion of a first 64-bit number of a plurality of 64-bit numbers, each of the plurality of 64-bit numbers stored in two memory locations within data memory for the DSP; access a least significant 32-bit portion of a second 64-bit number of the plurality of 64-bit numbers; add the least significant 32-bit portion of the first 64-bit number to the least significant portion of the second 64-bit number, the resulting sum comprising a first intermediate 32-bit number; if a carry is generated by the addition of the least significant 32-bit portion of the first 64-bit number to the least significant 32-bit portion of the second 64-bit number, increment an overflow counter to record the generated carry, the overflow counter stored in a status register of the DSP; access each of a plurality of least significant 32-bit portions of the remaining plurality of 64-bit numbers, add each of the plurality of least significant 32-bit portions to the first intermediate 32-bit number, and increment the overflow counter each time a carry is generated to record the generated carry; after each of the plurality of least significant 32-bit portions has been added to the first intermediate 32-bit number, store the first intermediate 32-bit number, the first intermediate 32-bit number comprising a least significant 32-bit portion of the sum of the plurality of 64-bit numbers; access a most significant 32-bit portion of the first 64-bit number; add the overflow counter to the most significant 32-bit portion of the first 64-bit number, the resulting sum comprising a second intermediate 32-bit number; access each of a plurality of most significant 32-bit portions of the remaining plurality of 64-bit numbers and add each of the plurality of most significant 32-bit portions to the second intermediate 32-bit number; and after each of the plurality of most significant 32-bit portions has been added to the second intermediate 32-bit number, store the second intermediate 32-bit number, the second intermediate 32-bit number comprising a most significant 32-bit portion of the sum of the plurality of 64-bit numbers.
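The storage arrangement recited in claim 11 (each 64-bit number in two data-memory locations, the overflow counter in a status register) might be modeled as below. This is a sketch under stated assumptions, not the DSP's actual layout: the word order (least significant word at the lower address), the counter field position `OVC_SHIFT`, and its 8-bit width are all hypothetical, as is the function name `sum_from_data_memory`.

```python
MASK32 = 0xFFFFFFFF
OVC_SHIFT = 8                    # hypothetical bit position of the counter field
OVC_MASK = 0xFF << OVC_SHIFT     # hypothetical 8-bit counter field


def sum_from_data_memory(mem, base, count, out):
    """Sum 'count' 64-bit numbers stored as word pairs starting at mem[base].

    Each number occupies two consecutive words, least significant word first
    (an assumption). The two result words are written to mem[out], mem[out + 1].
    Returns the final status register value.
    """
    status = 0                               # status register, counter field cleared

    # Pass 1: least significant words, accumulated in one register.
    acc = mem[base]
    for i in range(1, count):
        total = acc + mem[base + 2 * i]
        if total > MASK32:                   # carry: bump the counter field
            ovc = ((status & OVC_MASK) >> OVC_SHIFT) + 1
            status = (status & ~OVC_MASK) | ((ovc << OVC_SHIFT) & OVC_MASK)
        acc = total & MASK32
    mem[out] = acc                           # store the first intermediate number

    # Pass 2: the accumulator is reloaded for the most significant words.
    acc = (mem[base + 1] + ((status & OVC_MASK) >> OVC_SHIFT)) & MASK32
    for i in range(1, count):
        acc = (acc + mem[base + 2 * i + 1]) & MASK32
    mem[out + 1] = acc                       # store the second intermediate number
    return status
```

With an 8-bit counter field as assumed here, the count wraps after 255 carries, so a real field would need to be wide enough for the number of operands being summed.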
 12. The logic of claim 11, operable to: subtract a least significant 32-bit portion of a number of the plurality of 64-bit numbers from a least significant 32-bit portion of another 64-bit number of the plurality of 64-bit numbers; and if a borrow is generated by the subtraction of the least significant 32-bit portion of the 64-bit number from the least significant 32-bit portion of the other 64-bit number, decrement the overflow counter to record the generated borrow.
 13. A method for calculating a sum of 64-bit numbers using an overflow counter in a 32-bit environment, the method executed by a digital signal processor (DSP) and comprising: accessing a least significant 32-bit portion of a first 64-bit number of a plurality of 64-bit numbers, each of the plurality of 64-bit numbers stored in two memory locations within data memory for the DSP; accessing a least significant 32-bit portion of a second 64-bit number of the plurality of 64-bit numbers; adding the least significant 32-bit portion of the first 64-bit number to the least significant portion of the second 64-bit number, the resulting sum comprising a first intermediate 32-bit number; if a carry is generated by the addition of the least significant 32-bit portion of the first 64-bit number to the least significant 32-bit portion of the second 64-bit number, incrementing an overflow counter to record the generated carry, the overflow counter stored in a status register of the DSP; accessing each of a plurality of least significant 32-bit portions of the remaining plurality of 64-bit numbers, adding each of the plurality of least significant 32-bit portions to the first intermediate 32-bit number, and incrementing the overflow counter each time a carry is generated to record the generated carry; after each of the plurality of least significant 32-bit portions has been added to the first intermediate 32-bit number, storing the first intermediate 32-bit number, the first intermediate 32-bit number comprising a least significant 32-bit portion of the sum of the plurality of 64-bit numbers; accessing a most significant 32-bit portion of the first 64-bit number; adding the overflow counter to the most significant 32-bit portion of the first 64-bit number, the resulting sum comprising a second intermediate 32-bit number; accessing each of a plurality of most significant 32-bit portions of the remaining plurality of 64-bit numbers and adding each of the plurality of most significant 32-bit portions to the second intermediate 32-bit number; and after each of the plurality of most significant 32-bit portions has been added to the second intermediate 32-bit number, storing the second intermediate 32-bit number, the second intermediate 32-bit number comprising a most significant 32-bit portion of the sum of the plurality of 64-bit numbers.
 14. The method of claim 13, comprising: subtracting a least significant 32-bit portion of a number of the plurality of 64-bit numbers from a least significant 32-bit portion of another 64-bit number of the plurality of 64-bit numbers; and if a borrow is generated by the subtraction of the least significant 32-bit portion of the 64-bit number from the least significant 32-bit portion of the other 64-bit number, decrementing the overflow counter to record the generated borrow. 